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Abstract. Bayesian inference is often used in cosmology and astrophysics to derive con¬ 
straints on model parameters from observations. This approach relies on the ability to 
compute the likelihood of the data given a choice of model parameters. In many prac¬ 
tical situations, the likelihood function may however be unavailable or intractable due to 
non-gaussian errors, non-linear measurements processes, or complex data formats such as 
catalogs and maps. In these cases, the simulation of mock data sets can often be made 
through forward modeling. We discuss how Approximate Bayesian Computation (ABC) can 
be used in these cases to derive an approximation to the posterior constraints using simulated 
data sets. This technique relies on the sampling of the parameter set, a distance metric to 
quantify the difference between the observation and the simulations and summary statistics 
to compress the information in the data. We first review the principles of ABC and discuss 
its implementation using a Population Monte-Carlo (PMC) algorithm and the Mahalanobis 
distance metric. We test the performance of the implementation using a Gaussian toy model. 
We then apply the ABC technique to the practical case of the calibration of image simula¬ 
tions for wide field cosmological surveys. We find that the ABC analysis is able to provide 
reliable parameter constraints for this problem and is therefore a promising technique for 
other applications in cosmology and astrophysics. Our implementation of the ABC PMC 
method is made available via a public code release. 
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1 Introduction 


Bayesian inference is commonly used in cosmology and astrophysics to derive constraints on 
the parameters of a model from observations [e.g. 1,2]. In this framework, the posterior dis¬ 
tribution of the model parameters given the observed data is derived from a prior and from 
the likelihood of the data given a choice of model parameters. Various sampling techniques 
are used to constrain high dimensional parameter spaces such as the Metropolis Hastings 
[3, 4], Gibbs sampling [5], Nested sampling [6], Hamiltonian/Hybrid Monte Carlo [7], Se¬ 
quential Monte Carlo [8, 9] or the more recent affine invariant ensemble sampling method 
[10, 11]. Various software packages in cosmology have implemented these algorithms and 
are being used for different applications. Prominent examples are the cosmomc package [12], 
based on Metropolis Hasting, cosmopmc [13] which is based on Population Monte Carlo and 
Cosmo Hammer [14] that is built on an ensemble sampling algorithm. Popular codes based 
on Nested sampling are Multinest [15] or PolyChord [16]. More recently, packages have 
appeared that allow the user to switch between different sampling methods, such as Monte 
Python, Cosmo++ or Cosmosis [17-19]. 

Bayesian inference relies on the ability to compute the likelihood function. In practice, 
there are however situations in which the likelihood function is unknown or computationally 
intractable and where the direct Bayesian analysis is therefore not possible. In some of 
these instances, however, the simulation of mock data sets can be made through forward 
modeling [e.g. 20-23]. The simulations may include a model of the astrophysical signal, 
the instrument and observing conditions, and of the data analysis pipeline. For example, 
N-body simulations, semi-analytical models and image simulations can be used to generate 
simulated galaxy catalogues [e.g. 24]. Another example is the simulation of weak lensing 
or other extra-galactic maps which are characterised by non-gaussian statistics [e.g. 25-28], 
or image simulations for weak lensing measurements [e.g. 29-33]. In all these examples the 
likelihood function is not tractable due to the highly non-gaussian nature of the signal, the 
fact that the data is in the form of a catalogue or maps and that the model and measurement 
process is non-linear. 

In recent years, a new technique, known as Approximate Bayesian Computation (ABC), 
has gained attention in various fields such as population genetics, computational biology, 
ecology and psychology [34-43] . This method uses simulated data sets to bypass the need to 
evaluate a likelihood function. ABC systematically explores the prior model parameter space 
and compares the simulated and observed data sets using a distance metric. By accepting 
samples for which this distance metric is smaller than a given threshold, the method provides 
an approximation to the Bayesian posterior distribution. In addition, a summary statistic 
can be used to further compress the information in the data. Different algorithms have been 
proposed for performing these calculations efficiently. These include Importance Sampling, 
Markov Chain Monte Carlo, and different Sequential Monte Carlo methods [34-39, 44]. 

ABC methods have also started to be applied to problems in astronomy. For instance. 
Sequential Monte Carlo algorithms have been used for the model analysis of morphological 
transformation of galaxies [45] , the estimation of the luminosity function [46] and the inference 
of cosmological parameters using TYPE la supernovae [47]. Furthermore, a Markov Chain 
Monte Carlo variant of ABC was used to constrain the disk formation of the milky way 
[48]. Simultaneously to this work, [49] has published a software framework for likelihood- 
free inference based on ABC under the Cosmostatistics Initiative and [50] applied the ABC 
scheme to predict weak-lensing peak counts. 
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In this paper, we explore the use of ABC for forward modeling in cosmology. After 
reviewing the principles of ABC, we consider its implementation using a Population Monte- 
Carlo algorithm. We test the resulting implementation using a Gaussian toy model. We then 
present an application of ABC to the calibration of wide field image simulations. We do this 
by calibrating image simulations generated by the UFig software (Ultra Fast Image Genera¬ 
tor) [51] making use of the Mahalanobis distance metric [52]. This represents a refinement of 
Monte Carlo Control Loops [32], a forward modeling calibration method for weak lensing and 
other cosmological probes. The calibration finds a parameter configuration of UFig that min¬ 
imizes the difference between the simulations and the reference image using their statistical 
properties. This is important since a number of current high precision cosmological probes 
need to be tested for robustness against possible sources of systematics. These systematic 
errors can be numerous in nature and can also couple to each other in complex non-linear 
ways that are only evident when a realistic measurement is attempted. For these purposes 
creating simulations that have the same statistical properties as the data becomes crucial. 

This paper is organized as follows. In section 2.1, we discuss the principles of Ap¬ 
proximate Bayesian Computation and a particle based ABC algorithm as well as important 
performance considerations. Section 3 compares Bayesian with ABC analysis on a Gaussian 
toy model. In section 4 we introduce the problem of image simulation calibration, discuss 
a distance metric to compare multivariate data sets and show results. Our conclusions are 
summarised in section 5. We release a Python implementation of the ABC Population Monte- 
Carlo algorithm under GPLv3 license. Further details can be found in the Appendix A. 


2 Approximate Bayesian Computation 


Let us consider a data set y and a model parametrised by a set of parameters 0. From Bayes’ 
theorem, the posterior probability of the model given the data is 


pi^ly) 


p{y\9)p{9) 

p{y) 


( 2 . 1 ) 


where p{y\0) is the likelihood probability of the data given the model, p{9) is the prior prob¬ 
ability of the model and the normalisation comes from p(^), the evidence. This expression 
can be used to derive the posterior from the likelihood and the prior. 


2.1 Principles 

Standard Bayesian inference relies on the evaluation of a likelihood. However, such a function 
is often not available for simulation-based models. In Approximate Bayesian Computation 
this problem is bypassed by considering a distance metric p(x, y) that quantifies the difference 
between a simulated (x) and an observed (y) dataset. ABC algorithms sample the prior, 
p(0), and a candidate parameter 0* is accepted and retained as sample of the approximated 
posterior, if the distance p{x^y) between x and y is less than a specified threshold e [34, 35]. 
For small values of e, the ABC approximation to the posterior p{9\y) is 


p{9\y) - p{9\p{x, y) < e), (2.2) 

For complex data, it can be difficult or computationally expensive to calculate the 
distance p(x, y) using all the information available in x and y. Therefore, it is often useful to 
focus on summary statistics, S{x) and S{y)^ that capture the important features of the data 
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such as the means and standard deviations. If a summary statistic contains the same amount 
of information about the model parameters as the whole data set, i.e. iip{6\y) — p{0\S{y))^ it 
is referred to as being a sufficient statistic. If not it will lead to weaker posterior constraints. 
The ABC approximation to the posterior then becomes 

p{e\y) ~ p{e\p{S{x),S{y)) < e). (2.3) 

Sequential importance sampling (SIS) [34, 35] is among one of the first algorithms 
proposed for ABC calculations. This method works by exploring the prior in parameter 
space and discarding all proposed points that do not fulfill the criterion p(x, y) < e. For 
small values of e, this method can become inefficient since the rejection rate can get very 
high. In 2003, ABC methods based on Markov Chain Monte Carlo (MCMC) were proposed 
[36], which improved the sampling efficiency. More recently algorithms using Sequential 
Monte Carlo (SMC) with particle filtering [37-39, 44] and [for a review and tutorial see 43] 
have gained growing attention. 

Advanced algorithms such as ABC Population Monte Carlo (ABC PMC) are based on 
Sequential Monte Carlo (SMC) methods and try to circumvent the sampling inefficiency by 
constructing a series of intermediate distributions. These start from the prior distribution 
and converge to the approximate posterior (eq. 2.2) by sampling an intermediate distribution 
for a sequence of gradually decreasing thresholds e^. Besides the mentioned ABC methods 
further algorithms and alternative likelihood-free frameworks exist (see [43] and references 
therein). 

In the following, we adopt the ABC PMC algorithm, as it requires only a small number 
of user supplied tuning parameters. We now discuss the details of this algorithm. 

2.2 ABC PMC algorithm 

The ABC PMC algorithm works with a large set of solution candidates, referred to as a 
‘pooh in the following. Each candidate 0* represents a position in parameter space and 
is referred to as a ‘particle’. The algorithm first generates an initial pool of N particles, 
typically by randomly sampling from the prior p(0), until all candidates fulfil the criteria 
p{x^y) < Co, where cq is an initial threshold. Each particle in the pool is then assigned 
an initial weight In subsequent iterations, the algorithm moves to more stringent 

thresholds, resamples using the pool and updates the weights so that the particles sample 
the desired approximate posterior (eq. 2.3). 

The algorithm randomly samples from the pool taking into account the probability of 
each particle given by the assigned weight. Each sampled particle 0* is then perturbed by 
randomly drawing from a Gaussian distribution with mean 0* and a covariance matrix E 
estimated from the particle positions of the current pool. The new particle 0** is used to 
simulate a data set x and if the data set passes the criterion p{x^y) < et the particle 0** is 
accepted {et is the threshold chosen for iteration t). If particles are rejected, the process is 
repeated so as to maintain a constant population size of the pool. Finally, new weights are 
assigned to all the particles, which then determines the probability of being drawn in the 
next iteration. 

The weights allow the algorithm to favor particles from regions with high-probability 
and to reject particles from low-probability regions of the parameter space. The choice of 
weight uoi has an important impact on the efficiency of the algorithm. Originally, proposed 
by [39], the weight, in ABC PMC for particle Oi^t at iteration t is defined as 
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( 2 . 4 ) 


where piOi^t) is the prior evaluated at position and Oj^t-i are the weights 

and the particle position from the previous iteration, respectively. Finally, q{’\9i^t^^t) is a 
Gaussian kernel with mean 9i^t and covariance matrix The matrix is usually defined 
as twice the weighted variance of the particles 9i^t^. Alternatives to eq. 2.4 exist, such 
as replacing the Gaussian kernel with a multivariate Student-t distribution [53] or using an 
adaptive weighting scheme [54]. 

Algorithm 1 gives a schematic view of the ABC PMC algorithm. A Python package 
containing an implementation of the algorithm is made publicly available under GPLv3 
license. Its installation, instructions and documentation are described in detail in Appendix 

A. 


Data: y, tolerance thresholds e^, prior distribution p{9) 

Set t = 0; 

for z = 0 to do 

while p{x^y) > et do 

Sample 0* from the prior: 0* ^ p{0)] 

Create dataset x from 0* : x ^ Model(9'')] 
end 

Set 9i^t = 9*] 

Set ~ 5 

end 


Set St = 2 X S(0o:W,z); 

for t = 1 to T do 
for z = 0 to do 

while p(x, y) > et do 

Sample 0* from the previous iteration: 0* ^ ^o:W,z-i with weights 
Perturb 9*: 0** ^ A/’(0*,St); 

Create dataset x from 0** : x ^ Model{9*"^)] 

end 


Set 9i^t — ^ 
Set LJi^t = 






end 

Set St = 2 X (00:^,^) 5 

end 

Algorithm 1: ABC Population Monte Carlo Algorithm (adapted from [43]). 
the weighted empirical covariance. 


is 


Particle-based Sequential Monte Carlo algorithms offer advantages over commonly used 
MCMC algorithms. They are less likely to get stuck in low probability regions of parameter 
space and they reduce the difficulty of assessing the convergence of the sampling process. Both 
are especially true if the algorithms are used for ABC. Furthermore, the PMC algorithm can 
be trivially parallelized, which only advanced MCMC algorithms are suitable for [11, 14]. 
Particle based SMC algorithms are easily parallelizable by construction, as the process of 
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proposing and evaluating a new particle is independent of the remaining particle pool. Hence 
each particle could in principle be assigned to one CPU core. 

Despite having shown to produce reliable results [e.g. 37-41, 47, 50], it should be noted, 
however, that regular Bayesian inference should be favoured over the ABC method if an 
efficient likelihood evaluation is possible, as it usually offers better performance. 

2.3 Specific implementation 

For complex models the wall time is typically driven by the number of evaluations of the 
simulations. Therefore, having a high acceptance ratio is crucial in order to reduce the 
required time. The acceptance ratio in turn is driven by the choices of the thresholds as 
well as of the applied perturbation kernel. Using the appropriate balance in decreasing the 
threshold is important: if the decrease is slow, we expect a high acceptance ratio but on 
the other hand the true posterior is approximated only slowly. Decreasing the threshold fast 
results in a fast approximation of the posterior at the price of having a low acceptance ratio. 

Often the series of e is manually selected, which can be difficult to define and may lead 
to a low acceptance ratio or a poor approximation of the posterior. Instead, an adaptive 
choice of the threshold is preferable. It has been proposed that the threshold ct should be 
set as the (a^^-percentile of the sorted particle distances p{x^y) from the previous iteration 
[39, 55], where cr is a user defined value typically between 75 and 90. We find that this yields 
good results, both in terms of the acceptance ratio as well as of the final, approximated 
posterior. It has to be noted, however, that this approach can lead to a poor approximation 
to the posterior in certain cases [56]. 

The fact that the threshold is gradually adapted in the ABC PMC algorithm can be used 
to assess and monitor the convergence during the sampling process [40]. As et approaches 
small values in practice, the approximated posterior (eq. 2.2) stabilizes and does not vary 
much in the subsequent iterations. A further reduction of the threshold et does typically 
not improve the approximation significantly but will decrease the acceptance ratio [40, 50]. 
In practical applications, a lower limit of the acceptance ratio has been applied as stopping 
criterion for the sampling [49, 50]. Further more theoretical convergence measurements have 
also been studied in [57]. Depending on the definition of the distance p(x, y) the convergence 
of the sampling can be derived from the distribution of the distances. In particular if the 
expected variance of the stochastic model exceeds the threshold e no further improvement of 
the approximation should be expected. Other convergence criteria have also been proposed, 
such as monitoring changes in the Kullback-Leibler (KL) divergence of the target densities 
[53] or thresholding on the so-called effective sample size (ESS) [44, 53]. 

The original algorithm proposed by [39] uses a Gaussian distribution with mean 9i^t ^^nd 
twice the weighted covariance matrix of the particles as perturbation kernel. This choice 
minimizes the Kullback-Leibler distance between the desired posterior and the proposal dis¬ 
tribution, which in turn maximizes the acceptance probability. 

Lately, an alternative perturbation kernel has been proposed, which improves the ac¬ 
ceptance ratio especially in non-linear, highly correlated parameter spaces [58]. The optimal 
local covariance matrix (OLCM) kernel is different for every particle It uses a multivari¬ 
ate normal distribution with a covariance matrix based on a subset of the particles from the 
previous iteration, whose distances are smaller than the threshold et of the current iteration: 

I {0k,t, Wfc,t) = I ^) s.t. p{x, y) < et, 0 < i < ivj (2.5) 


- 6 - 


(2.6) 


No 

~ X] - lj){h - Ij)'^ + (m - - Oi^tf 


k=0 


where re is a normalization constant defined such that ~ ^ ~ 

The covariance matrix X 51 . ^ in eq. 2.6 is additionally corrected using a bias term to compen¬ 
sate the discrepancy between the mean of the particle population and the current particle 
Oi^t (See [58], Section 4.3.2 for detailed explanation on this kernel). Our experiments have 
shown that the OLCM kernel is able to increase the acceptance ratio while having a good 
exploration of the parameter space. 


3 Gaussian toy model 

Let us consider the case where the data y = {^ 1 , 2 / 2 , 2/n} consists of independent and 

identically distributed (IID) samples yi drawn from a normal distribution with mean 9 and 
standard deviation a. We will assume that the standard deviation a is known and will seek 
to evaluate the mean 9 first using a Bayesian analysis and then comparing it with an ABC 
analysis. 

3.1 Bayesian analysis 

The probability distribution function (PDF) of a single sample yi given that the mean is 9 
is P{yi\9) = /((j\/^). Since the variables yi are independent the likelihood is 

given by the joint probability 

p{y\Q) = f\p{yi\Q) = (o-\/^) exp - ^ ^^'2 P 

i=l _ i=\ 

Assuming a flat prior (i.e. that p{9) oc constant), the normalised posterior probability 
is 


(3.1) 


p{e\y) = 


n 




1 

\ 2 

"S 

1 

Si 

to 

) exp 

2cr2 


(3.2) 


where y — mean of the data points. Thus, in this simple case, the 

posterior probability distribution of the parameter 9 is, a Gaussian with mean equal to the 
mean of the data and standard deviation 


3.2 ABC analysis 

Let us now pretend that we do not have the analytical expression for the likelihood (eq. 3.1). 
We instead use ABC to estimate the posterior. For this purpose, we consider the average of 
the data points as a summary statistic 


S{y) = y (3.3) 

We further consider the distance between the data y and a simulated data set x = 
(xi, X 2 , Xji) to be defined as 

p[S'(a;),S'(y)] = \S{x) - S{y)\ = \x - y\ (3.4) 
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For a given 0, the average of the simulated data x is distributed like a Gaussian distribu¬ 
tion with mean 9 and standard deviation a j^/n. By considering the condition p — p{x^ y) < ^ 
and for a flat prior, one can then show that the ABC approximation to the posterior for a 
given threshold e has the analytic form [59] 


p{e\p <e) = 


1 

Ye 


$ 


/y-e + e\ 

V J 



(3.5) 


where the cumulative distribution function (CDF) of the normal distribution can be 
expressed in terms of the error function as $(t) = [l + erf(t/\/2)] /2. By Taylor expanding 
it is simple to show that p{9\p < e) is equal to the Bayesian posterior p{9\y) (eq. 3.2) in 
the limit e ^ 0. 

One can show that the expectation value of the posterior distribution is E[9\p < e] = 
while its variance is 


var[6'|p < e] = — + ^. (3.6) 

77 / O 

Thus in the limit 6^0 the ABC variance for 9 (eq. 3.6) reduces to the variance 
of the Bayesian posterior a‘^/n (eq. 3.2). On the other hand as e is increased, the ABC 
variance increases and so does the ABC acceptance rate, highlighting the trade off between 
the precision of the estimation of the parameter and the computational cost. Note that for 
large values of e the variance diverges due to our improper flat prior. 

3.3 Results 

We use algorithm 1 with data y = {^ 1 , 7 / 2 , 7/^} defined as tT/ = 10^ samples randomly drawn 

from a normal distribution with mean 9 = 1 and standard deviation a = 1. As prior, we 
define p{9) ex constant for —5 < 0 < 5 and consider the distance p{S{x), S{y)) defined in eq. 
3.4. To generate simulated data x we use a normal distribution with mean 0* and standard 
deviation a. We set the initial threshold to cq = 0.5 and gradually decrease the threshold 
using cr = 90 percentile of the particle pool. In each iteration, we create N = 2000 particles 
and apply the multivariate normal kernel described in section 2.2. 

Figure 1 shows the analytical posterior p{9\p < et) (green line) given by eq. 3.5 and 
a kernel density estimation (KDE) of the ABC posterior (blue line). Each panel depicts 
the results of different iterations t with its corresponding decreasing threshold value et (only 
even iterations are displayed). The sampled posteriors are in good agreement with the 
analytical prediction and for small values of et they are close to the expected Bayesian 
distribution (red distribution in the figure). Further decrease of the threshold beyond 0.01 
does not significantly change the posterior, which indicates the convergence of the estimation. 
This finding is supported by figure 2, which shows the expected (eq. 3.6, green line), the 
approximated (blue line) and the Bayesian (red line) variance as a function of e. 

4 Application to image modelling 

The Ultra Fast Image Generator (UFig) is a wide-held image simulation software package 
that generates realistic images and was optimised for fast computation [51]. In [33] UFig 
was used to generate image simulations with statistical properties consistent with observed 
images from the Dark Energy Survey (DES) [60]. This was done in the MCCL (Monte Carlo 
Control Loops) [32] framework by tuning the input parameters of the simulations. 
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Figure 1. Posterior distribution of the Gaussian toy model at different iterations t and corresponding 
threshold values The numerical ABC PMC posterior is represented using a kernel density estimator 
(blue line) and is in good agreement with the analytical prediction (eq. 3.5, green line). As a 
comparison, the expectation for a Bayesian analysis is shown in red, which is the same function in all 
panels and was clipped when convenient. 


We now study how the ABC scheme can be applied to this problem, further details of 
which can be found in [33]. Rather than comparing the images at pixel level, we first analyse 
them with the widely used Source-Extractor package [61], which produces a catalog of 
identified objects (e.g. stars and galaxies) and their properties (e.g size, flux and shape 
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Figure 2. Variance of ^ as a function of the threshold values e for the Gaussian toy model. The 
numerical ABC PMC variance (blue line) is in good agreement with the analytical prediction (green 
line) form eq. 3.6 and with the Bayesian case (red line) as e ^ 0. 

parameters). These catalogs are used to construct summary statistics of the images and a 
distance metric between the simulations and a target image. The ABC iterative framework 
thus has the following steps: 

Propose The ABC PMC algorithm chooses and perturbs a particle from the pool and 
creates a new particle 0* that represents a new position in parameter space. 

Create image UFig is parameterized using the proposed values in 0* and generates a new 
image. 

Extract objects Source-Extractor extracts the objects from the image, estimates their 
properties and generates a catalog. 

Postprocessing The Source-Extractor catalog is post-processed to remove unphysical 
outliers and other artifacts and to compute derived properties such as object ellipticities. 

Comparison This catalog is then compared to the one from the target image using the 
distance metric. 

This process is repeated for all ABC particles in each iteration while the threshold is 
gradually lowered until convergence is reached. 

4.1 Distance metric 

In various practical applications, the impact of the choice of the distance metric on the 
approximated posterior have been studied. While a good choice plays an important role, 
sometimes even simple metrics have yielded good results [40, 41, 47, 50, 62]. As summary 
statistics and their corresponding distance are typically problem specific, it is difficult to 
define metrics that are generally applicable. Instead domain knowledge and heuristics are 
being used in practice to define appropriate statistics. In [63], the authors reviewed techniques 
that can be used for the definition of summary statistics. 


- 10 - 














MAG_BEST FLUX_RADIUS ELLIPTICITY_E1 ELLIPTICITY_E2 


Figure 3. Marginal distributions of the four selected colums of the UFig Source-Extractor catalog 
from the target image. The plot shows the non-gaussianity and the non-linear correlations of the data 
set. Created with triangle.py [65] 


In the following we discuss distance metrics for Source-Extractor catalogs, where the 
statistical properties of multidimensional samples have to be compared. Quantifying the 
discrepancy between two multidimensional statistical distributions is non-trivial [see e.g. 64]. 
For univariate data sets, various statistical techniques have been developed to determine if two 
sets follow the same underlying PDF. A prominent example is the two-sample Kolmogorov- 
Smirnov test (KS test). Applying a KS test to multivariate data is not directly possible, 
especially beyond two dimensions. Applying the test to every dimension individually is 
typically insufficient, as correlations between different parameters are not taken into account. 
This is problematic for the image modeling application since, as we will see below, the 
object properties in the Source-Extractor catalogs are typically numerous and non-linearly 
correlated. 

Diverse methods founded in information theory exist to quantify the difference between 
two multivariate distributions. These include the Kullback-Leibler divergences and its sym¬ 
metrized variant the Jensen-Shannon divergence [66]. Both methods require the estimation 
of the underlying PDF. A common way to do this is to use a nearest neighbor or a kernel 
density estimator. However, both estimation methods tend to introduce an unwanted noise 
and bias in the distance measure [67] . Furthermore estimating the underling PDF is difficult 
in higher dimensions and is typically computationally intensive. Another approach is to de¬ 
fine a distance metric between two multivariate data sets based on the Mahalanobis distance 
[52]. We find that the Mahalanobis distance approach provides better constraints on the 
posterior while being computationally less demanding. For this reasons we opt for the latter 
in the following. 
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The Mahalanobis distance between data vector y (in our case derived from the target 
image) and a simulated data vector x (from proposed simulated image) is based on the 
summary statistic, 


S{y) = \/{y- fJ.yVT,y^{y - fly) 

(4.1) 

Six) = J{x- Hy)'^T.pix - Hy), 

(4.2) 


where jiy is the mean of y and its covariance matrix. Note that in eq. 4.2 the data 
X is compared to the center and covariance matrix of the observed data set y. As 
S{x) and S{y) are one-dimensional projections of x and y the distance p{S{x)^S{y)) can be 
set to the standard one dimensional KS test for two-samples. In other words, the projection 
is the distribution of the distances to the center of the multidimensional distribution of the 
observed data set while taking into account the correlation in the data sets. The distance 
between the two projections is the maximal difference between the cumulative distribution 
functions (CDF) of their Mahalanobis distances. 

As the Mahalanobis distance (eq. 4.1 and 4.2) relies on the mean and the covariance 
matrix it is important to note that the proposed distance p{S{x),S{y)) works best on uni- 
modal distributions. For data sets that heavily differ from a Gaussian, such as multimodal 
distribution for example, the distance might introduce an unwanted bias and yield improper 
approximations. 

4.2 Results 

In this section, we combine the ABC PMC algorithm described in section 2.2 and the Maha¬ 
lanobis distance metric to constrain the UFig simulation parameters to mimic a given target 
image. 

For this purpose we generated a target image using the parameters 9 shown in the 
second column of table 1 [see 33, 51, for details]: 

• size-sigma defines the root mean square of the size (r50) distribution of galaxies in 
arcsec, 

• size-theta is the correlation angle for size-magnitude distribution of galaxies, 

• el-sigma and e2-sigma are root mean square of the two components of the galaxy 
ellipticities el and e2. 

For the example explored here all the other simulation parameters are kept fixed to val¬ 
ues similar to those in [51]. We run the ABC PMC algorithm 1 with the OLCM permutation 
kernel from Section 2.3 and N = 400 particles on these four simulation parameters. The 
initial threshold cq was set to 0.2 and automatically reduced by using cr = 90 percentile. To 
calculate the distance metric we used the following Source-Extractor catalog columns: 

• MAG-BEST is the estimated magnitude of the objects, 

• FLUX-RADIUS is the half light radius of an object, 

• ELLIPTICITY-El and ELLIPTICITY-E2 are the two components of ellipticities of the 
objects. 
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Table 1. Target image parameter configuration, prior definition and ABC posterior. The uncertain¬ 
ties corresponds to one standard deviation. 


Parameter 

Target 

Prior 

ABC 

size-theta 

0.14 

0.15±0.03 

0.1420±0.0047 

size-sigma 

0.23 

0.21±0.07 

0.2376±0.0154 

el-sigma 

0.25 

0.26±0.07 

0.2435±0.0119 

e2-sigma 

0.25 

0.24±0.07 

0.2585±0.0104 


The one- and two-dimensional marginal distributions of these four catalog properties 
from the target image, are shown in figure 3. As stated earlier, these are highly and non- 
linearly correlated. As priors, we use a component-wise Gaussian distributions with means 
and standard deviations shown in the third column of table 1. The widths of the priors are 
choosen such that they exeed the 95% confidence limits shown in table 2 in the appendix of 
[33]. Furthermore, the means of the priors have been defined to be shifted by at least la 
from the true target value 

The calculation was parallelized on 200 Intel Ivy Bridge EP E5-2660v2 2.2 GHz cores 
and resulted in a wall time of approximately 20 hours. 

Figure 5 shows the marginal distributions of the ABC posterior. The blue lines denote 
the true parameter values used to generate the target image, which are consistent with the 
approximated posterior. The approximated posterior is well behaved and displays correla¬ 
tion between different parameters. The means and standard deviations of the parameters 
estimated from the approximate posterior are shown in the fourth column of table 1. The 
ABC PMC algorithm was thus able to refine the prior information and correctly moved the 
mean towards the target parameter values and reduced the errors on all parameters. The 
estimated errors are in grood agreement with the 95% confindence limit in [33] Figure 4 
shows the behavior of the thresholds et and the acceptance ratio as a function of the number 
of iterations for the normal multivariate and the OLCM kernel. The OLCM kernel allows 
for a faster decrease of the threshold while having a higher acceptance ratio. 

5 Conclusion 

We explored how Approximate Bayesian Computation can be applied to forward modeling 
in cosmology, where the likelihood is unavailable or intractable. We discussed a common 
implementation of the ABC algorithm, the Population Monte Carlo (PMC) algorithm, which 
is a combination of Sequential Monte Carlo and particle filtering. The algorithm performs 
the approximation of the posterior distribution by using a pool of particles that are within 
a certain distance threshold. The solution is iteratively improved by gradually lowering this 
threshold value. We discuss several considerations for good performance of the algorithm such 
as an automatic reduction of the acceptance threshold and the impact of particle permutation 
kernels. 

We apply the ABC PMC algorithm to a Gaussian toy model as well as to the calibration 
of an image generated with the simulation software (UFig). We show that the analytical pre¬ 
dictions for the toy model are in good agreement with our empirical results and approach the 
Bayesian posterior in the limit of small thresholds. For the image calibration application, we 
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Figure 4. The threshold e and the acceptance ratio as a function of the iterations for the UFig image 
modeling. The blue line shows the multivariate kernel (MVK)[39] and the green line denotes the 
optimal local covariance matrix kernel (OLCM) [58]. 


introduce a distance measure based on a parameter space projection using the Mahalanobis 
distance that can be used to measure the discrepancy between two multivariate distributions. 
To assess the goodness of the inferred solution, we have generated an image with known con¬ 
figuration and compared the estimated posterior with the input configuration. We find that 
ABC produced a reliable approximate posterior that was consistent with the input param¬ 
eter values and mapped the correlation between simulation parameters. The ABC method 
with its PMC implementation is thus promising for numerous forward modeling problems in 
cosmology and astrophysics. 
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A Package distribution 

Detailed documentation, examples and installation instructions for the ABC PMC imple¬ 
mentation can be found on the package website http://abcpmc.readthedocs.org/. The 
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Figure 5. The one- and two-dimensional marginal distributions of the approximate UFig parameter 
posterior. The blue lines denote the true initial parameter configuration. Created with triangle.py 
[65] 


package is released under the GPLv3 license and has been uploaded to PyPI^ and can be 
installed using pip^: 

$ pip install abcpmc —user 

This will install the package and all of the required dependencies. The development 
is coordinated on GitHub http://github.com/jakeret/abcpmc and contributions are wel¬ 
come. 

The package is entirely written in Python and contains the algorithm 1 as well as various 
threshold schemes, prior implementations and different pertubation kernels. The code is built 
with a flexible design such that one can easily extend the provided functionality. 
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